Support max_runtime in scheduler #6785
Force-pushed from 9cfcd4d to 56b2ed2
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@ Coverage Diff @@
## main #6785 +/- ##
==========================================
+ Coverage 83.90% 83.91% +0.01%
==========================================
Files 365 365
Lines 21410 21428 +18
Branches 948 948
==========================================
+ Hits 17964 17982 +18
Misses 3152 3152
Partials 294 294
src/ert/scheduler/job.py
Outdated
self.real = real
self.started = asyncio.Event()
self.returncode: asyncio.Future[int] = asyncio.Future()
self.aborted = asyncio.Event()
self._scheduler = scheduler
self._callback_timeout: Optional[Callable[[int], None]] = callback_timeout
Why is this function needed and do we need this function as an attribute?
The timeout is set up by _legacy.py and we need to keep in sync with how the existing job_queue operates.
Maybe we can have an if isinstance(self._job_queue, Scheduler): check in _legacy.py? I reckon that is less dirty than having this callback_timeout.
Fixed!
I added a commit that refactored the PR to use asyncio.wait_for, but I am not sure I liked that alternative better than the _max_runtime_task.
src/ert/scheduler/job.py
Outdated
threadsafer_future_wait(self.returncode),
timeout=self.real.max_runtime,
)
except asyncio.exceptions.TimeoutError:
asyncio.TimeoutError
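For context on the suggestion above: assuming Python 3.8 or newer (where the asyncio.exceptions module exists), both spellings refer to the same class, so the shorter name catches exactly the same exception. The helper name below is hypothetical and only there to make the check reusable.

```python
import asyncio
import asyncio.exceptions


def same_timeout_class() -> bool:
    """Return True if both spellings are bound to the same class object."""
    return asyncio.TimeoutError is asyncio.exceptions.TimeoutError
```

Calling same_timeout_class() should return True on supported Python versions, which is why the shorter asyncio.TimeoutError is usually preferred in an except clause.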
When studying both approaches, I'm a bit more inclined (since it is cleaner, tbh) towards having a separate task that just calls asyncio.sleep and cancels the job at the end. Also, wait_for might just be another version of the same approach anyway.
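The two approaches discussed above can be sketched side by side. This is a minimal illustration, not the code from this PR: run_job, run_with_watchdog, and run_with_wait_for are hypothetical names, and the "job" is just a sleep standing in for a real realization.

```python
import asyncio
from typing import Optional


async def run_job(duration: float) -> int:
    """Stand-in for a real job: sleeps, then returns an exit code."""
    await asyncio.sleep(duration)
    return 0


async def run_with_watchdog(duration: float, max_runtime: Optional[float]) -> str:
    """Approach 1: a separate task sleeps for max_runtime, then cancels the job."""
    job = asyncio.create_task(run_job(duration))
    timed_out = False

    async def watchdog(limit: float) -> None:
        nonlocal timed_out
        await asyncio.sleep(limit)
        timed_out = True
        job.cancel()

    guard = None
    if max_runtime is not None:
        guard = asyncio.create_task(watchdog(max_runtime))
    try:
        await job
        return "completed"
    except asyncio.CancelledError:
        if not timed_out:
            raise  # cancelled by someone other than the watchdog
        return "timed out"
    finally:
        if guard is not None:
            guard.cancel()  # no-op if the watchdog already finished


async def run_with_wait_for(duration: float, max_runtime: Optional[float]) -> str:
    """Approach 2: let asyncio.wait_for enforce the limit (None means no limit)."""
    try:
        await asyncio.wait_for(run_job(duration), timeout=max_runtime)
        return "completed"
    except asyncio.TimeoutError:
        return "timed out"
```

For example, asyncio.run(run_with_watchdog(0.5, 0.01)) and asyncio.run(run_with_wait_for(0.5, 0.01)) both report a timeout. The watchdog variant keeps the timeout logic in its own task, which makes it easy to attach extra behaviour (such as a timeout callback) at the point where the limit is hit; wait_for is shorter but folds the timeout into the await itself.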
Force-pushed from ab0c709 to 445d9d1
This highlights a behavioural change in the new LocalDriver: it will not send the same events as the legacy local driver; see test_async_queue_execution.py::test_happy_path. The new scheduler will not catch bare exceptions for now, and thus the test for that situation is only applied to the legacy JobQueue.
🎉
Issue
Resolves #6784